VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning

Neural Information Processing Systems

Our framework has three stages: in stage 1, we leverage non-RL datasets (e.g. ImageNet) to learn task-agnostic visual representations; in stage 2, we use offline RL data (e.g. a limited number of expert demonstrations) to convert the task-agnostic representations into more powerful task-specific representations; in stage 3, we fine-tune the agent with online RL.



Deep Proxy Causal Learning and its Application to Confounded Bandit Policy Evaluation

Neural Information Processing Systems

Proxy causal learning (PCL) is a method for estimating the causal effect of treatments on outcomes in the presence of unobserved confounding, using proxies (structured side information) for the confounder.